Model selection procedure for high-dimensional data
نویسندگان
چکیده
منابع مشابه
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
متن کاملModel Selection in Gaussian Regression for High-dimensional Data
We consider model selection in Gaussian regression, where the number of predictors might be even larger than the number of observations. The proposed procedure is based on penalized least square criteria with a complexity penalty on a model size. We discuss asymptotic properties of the resulting estimators corresponding to linear and so-called 2k ln(p/k)-type nonlinear penalties for nearly-orth...
متن کاملMethods for regression analysis in high-dimensional data
By evolving science, knowledge and technology, new and precise methods for measuring, collecting and recording information have been innovated, which have resulted in the appearance and development of high-dimensional data. The high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...
متن کاملFeature Selection for High-dimensional Integrated Data
Motivated by the problem of identifying correlations between genes or features of two related biological systems, we propose a model of feature selection in which only a subset of the predictors Xt are dependent on the multidimensional variate Y , and the remainder of the predictors constitute a “noise set” Xu independent of Y . Using Monte Carlo simulations, we investigated the relative perfor...
متن کاملAttribute Selection for High Dimensional Data Clustering
We present a new method to select an attribute subset (with few or no loss of information) for high dimensional data clustering. Most of existing clustering algorithms loose some of their efficiency in high dimensional data sets. One possible solution is to use only a subset of the whole set of dimensions. But the number of possible dimension subsets is too large to be fully parsed. We use a he...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Statistical Analysis and Data Mining
سال: 2010
ISSN: 1932-1864
DOI: 10.1002/sam.10088